Mining of Bilingual Indian Web Documents

Similar resources

Building Bilingual Dictionaries from Parallel Web Documents

In this paper we describe a system for automatically constructing a bilingual dictionary for cross-language information retrieval applications. We describe how we automatically target candidate parallel documents, filter the candidate documents and process them to create parallel sentences. The parallel sentences are then automatically translated using an adaptation of the EMIM technique and a ...

Mining the Web for Bilingual Text

STRAND (Resnik) is a language-independent system for automatic discovery of text in parallel translation on the World Wide Web. This paper extends the preliminary STRAND results by adding automatic language identification, scaling up by orders of magnitude, and formally evaluating performance. The most recent end product is an automatically acquired parallel corpus comprising English-French documen...

Mining Domain Specific Words from Web Documents

Web pages provide not only plain text materials for training language models but also tag information for semantics annotation. The tags could be found either explicitly in the HTML documents or implicitly through the directory hierarchy of the documents, since the directory hierarchy can be regarded as a kind of classification tree for web documents, which assigns an implicit hidden tag to eac...

Mining Web Documents for Unintended Information Revelation

This research concerns web site information security. With an increasing number of documents being generated by different individuals and departments in organizations, there is a potential of releasing information which is inconsistent with the overall goals, objectives and operation of the organization. We refer to this as unintended information revelation (UIR). This paper focuses on progress...

Web Mining: Clustering Web Documents A Preliminary Review

Evidently there is a tremendous proliferation in the amount of information found today on the largest shared information source, the World Wide Web (or simply the Web). The process of finding relevant information on the web can be overwhelming. Even with the presence of today’s search engines that index the web it is hard to wade through the large number of returned documents in a response to a...

Journal

Journal title: Procedia Computer Science

Year: 2016

ISSN: 1877-0509

DOI: 10.1016/j.procs.2016.06.103